Optimizing Document Classification: Unleashing the Power of Genetic Algorithms
Abstract
Many individuals, including researchers, professors, and students, encounter difficulties when searching for scholarly documents, papers, and journals within a specific domain. Consequently, scholars have begun to focus on the document classification problem, offering various methods to address this issue. Researchers have utilized diverse data sources, such as citations, metadata, content, and hybrids, in their approaches. Among these, the metadata-based approach stands out for research paper classification because metadata is available at no cost. Various scholars have employed different metadata parameters of research articles, such as title, abstract, keywords, and general terms, for classification. In this study, we chose four metadata features (title, abstract, keywords, and general terms) from the SANTOS dataset, which was prepared by ACM. To represent these features numerically, we employed a semantic-based model called BERT instead of the commonly used count-based models. BERT generates a 768-dimensional vector for each record, which introduces significant time complexity during computation. Our proposed approach therefore optimizes the features using a genetic algorithm. Optimal feature selection plays a crucial role in this domain, enhancing the overall accuracy of the system while reducing the time complexity associated with selecting the most relevant features from a large-dimensional space. For classification purposes, we employed GNB and SVM classifiers. The outcomes of the study showed that the combination of title and keywords outperformed other combinations.
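The feature-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the genetic operators, so truncation selection, one-point crossover, bit-flip mutation, and all parameter values below are assumptions. Synthetic Gaussian vectors stand in for the 768-dimensional BERT embeddings, and a small hand-written Gaussian naive Bayes stands in for a library GNB classifier. The function names (`make_toy_data`, `gnb_accuracy`, `select_features`) are hypothetical.

```python
import math
import random


def make_toy_data(n_samples=200, n_features=12, n_informative=3, seed=1):
    """Synthetic stand-in for embedding vectors: only the first columns carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        label = rng.randrange(2)
        row = [rng.gauss(2.0 * label if j < n_informative else 0.0, 1.0)
               for j in range(n_features)]
        X.append(row)
        y.append(label)
    return X, y


def gnb_accuracy(X_train, y_train, X_test, y_test, mask):
    """Fit Gaussian naive Bayes on the columns selected by mask; return test accuracy."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    stats = {}
    for label in set(y_train):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        means = [sum(r[j] for r in rows) / len(rows) for j in cols]
        varis = [sum((r[j] - m) ** 2 for r in rows) / len(rows) + 1e-9
                 for j, m in zip(cols, means)]
        stats[label] = (math.log(len(rows) / len(y_train)), means, varis)
    correct = 0
    for x, t in zip(X_test, y_test):
        def log_posterior(label):
            prior, means, varis = stats[label]
            return prior + sum(
                -0.5 * math.log(2 * math.pi * v) - (x[j] - m) ** 2 / (2 * v)
                for j, m, v in zip(cols, means, varis))
        correct += max(stats, key=log_posterior) == t
    return correct / len(y_test)


def select_features(X_train, y_train, X_val, y_val, pop_size=20,
                    generations=15, mutation_rate=0.05, seed=0):
    """Genetic search over boolean feature masks; fitness is validation accuracy."""
    rng = random.Random(seed)
    n = len(X_train[0])

    def fitness(mask):
        return gnb_accuracy(X_train, y_train, X_val, y_val, mask)

    population = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]      # truncation selection (assumed)
        children = [ranked[0][:]]              # elitism: keep the current best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)          # one-point crossover (assumed)
            child = a[:cut] + b[cut:]
            # bit-flip mutation (assumed): toggle each gene with small probability
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
        best = max(population + [best], key=fitness)
    return best, fitness(best)


if __name__ == "__main__":
    X, y = make_toy_data()
    mask, acc = select_features(X[:140], y[:140], X[140:], y[140:])
    print("selected columns:", [j for j, keep in enumerate(mask) if keep])
    print("validation accuracy:", round(acc, 3))
```

In a real pipeline the toy vectors would be replaced by the embeddings of each metadata field, and the fitness function could add a penalty on the number of selected dimensions if a smaller subset is preferred.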
Journal
Journal title: IEEE Access
Year: 2023
ISSN: 2169-3536
DOI: https://doi.org/10.1109/access.2023.3292248